108 results found.
Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese English Japanese
Availability:
Freely Available
License:
Size:
10M sentences
Production Status:
Newly created-in progress
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:SpanAlign: Sentence Alignment Method based on Cross-Language Span Prediction and ILP
-
Paper track:Long paper/
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Katsuki Chousa | JParaCrawl | /N |
Documentation:
None
Written
Encyclopedia,
Language Type:
Multilingual
Languages:
Bulgarian Chinese English French
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Corpus Creation/Annotation
-
Paper title:Document Sub-structure in Neural Machine Translation
-
Paper track:Written/oral presentation
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Radina Dobreva | Wikipedia dump | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
Bulgarian Chinese English French
Availability:
Freely Available
License:
CC-BY-SA 3.0 licence
Size:
None
Production Status:
Newly created-finished
Use:
Machine Translation, SpeechToSpeech Translation
-
Paper title:Document Sub-structure in Neural Machine Translation
-
Paper track:Written/oral presentation
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Radina Dobreva | Parallel Wikipedia Biographies with Sub-structure | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese English Japanese
Availability:
Freely Available
License:
Size:
None
Production Status:
Existing-used
Use:
Language Modelling
-
Paper title:GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors
-
Paper track:Written/oral presentation
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Masato Hagiwara | W2C - Web To Corpus | /N |
Documentation:
None
Written
Tagger/Parser,
Language Type:
Multilingual
Languages:
Chinese English German Spanish French
Availability:
Freely Available
License:
GNU General Public License
Size:
384 MByte
Production Status:
Existing-used
Use:
Question Answering
-
Paper title:High Accuracy Rule-based Question Classification using Question Syntax and Semantics
-
Paper track:Syntactic and Semantic Parsing, Grammar Induction
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country | | |
|---|---|---|---|---|---|
| Author 1 | Harish Tayyar Madabushi | University of Birmingham | GB | | |
| Author 2 | Mark Lee | University of Birmingham | N/A | School of Computer Science, University of Birmingham, UK | None |
| Main Contact | Harish Tayyar Madabushi | University of Birmingham | None | | |
Documentation:
http://nlp.stanford.edu/nlp/javadoc/javanlp/
Written
Corpus,
Language Type:
Multilingual
Languages:
Chinese
Availability:
From Data Center(s)
License:
LDC
Size:
6.08 MByte
Production Status:
Existing-used
Use:
Parsing and Tagging
-
Paper title:Graph-based Dependency Parsing with Bidirectional LSTM
-
Paper track:Empirical/Data-Driven
-
Paper status:Accept - Poster - Tuesday
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Wenhui Wang | Institute of Computational Linguistics Dept of Computer Science & Technology, Peking University | CN |
| Author 2 | Baobao Chang | Institute of Computational Linguistics, Peking University | CN |
| Main Contact | Wenhui Wang | Institute of Computational Linguistics Dept of Computer Science & Technology, Peking University | None |
Documentation:
<Not Specified>
Language Type:
Multilingual
Languages:
Chinese
Availability:
Freely Available
License:
<Not Specified>
Size:
7 KByte
Production Status:
Existing-updated
Use:
Discourse
-
Paper title:Chinese Tense Labelling and Causal Analysis
-
Paper track:Linguistic Issues in NLP
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Hen-Hsen Huang | Department of Computer Science and Information Engineering, National Taiwan University | TW |
| Author 2 | Chang-Rui Yang | National Taiwan University | N/A |
| Author 3 | Hsin-Hsi Chen | National Taiwan University | TW |
| Main Contact | Hen-Hsen Huang | Department of Computer Science and Information Engineering, National Taiwan University | None |
Documentation:
Yes. In English. Yes.
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese Dutch Finnish French German Greek Hungarian Japanese Russian Spanish
Availability:
Freely Available
License:
Apache-2.0
Size:
None
Production Status:
Existing-used
Use:
Speech Synthesis
-
Paper title:One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
-
Paper track:7.14 Cross-lingual and multilingual aspects in spe/Poster Presentation
-
Paper status:Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Tomáš Nekvinda | CSS10 | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese Dutch French German Russian
Availability:
Freely Available
License:
Creative Commons CC0
Size:
None
Production Status:
Existing-updated
Use:
Speech Synthesis
-
Paper title:One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech
-
Paper track:7.14 Cross-lingual and multilingual aspects in spe/Poster Presentation
-
Paper status:Accept - Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Tomáš Nekvinda | Cleaned Common Voice | /N |
Documentation:
None
Embeddings and validation dictionaries,
Language Type:
Multilingual
Languages:
Chinese English Japanese Turkish
Availability:
License:
Size:
2.8 GB
Production Status:
Use:
Embeddings for reproduction results of unsupervised machine translation
-
Paper title:A Closer Look on Unsupervised Cross-lingual Word Embeddings Mapping
-
Paper track:Evaluation/poster presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Kamil Pluciński | Datasets used in attached paper | /N |
Documentation:
None




